WHIRL: A word-based information representation language
Abstract
We describe WHIRL, an "information representation language" that synergistically combines properties of logic-based and text-based representation systems. WHIRL is a subset of non-recursive Datalog that has been extended by introducing an atomic type for textual entities, an atomic operation for computing textual similarity, and a "soft" semantics; that is, inferences in WHIRL are associated with numeric scores, and presented to the user in decreasing order by score. We show that WHIRL strictly generalizes both ranked retrieval of documents and logical deduction; that non-trivial queries about large databases can be answered efficiently; that WHIRL can be used to accurately integrate data from heterogeneous information sources, such as those found on the Web; that WHIRL can be used effectively for inductive classification of text; and finally, that WHIRL can be used to semi-automatically generate extraction programs for structured documents.
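The idea of a similarity operation with "soft", score-ranked answers can be illustrated with a small sketch. The code below is not WHIRL itself: it assumes TF-IDF weighting with cosine similarity as the textual similarity measure (the abstract does not name one), and the names `tfidf_vectors`, `cosine`, and `soft_join` are hypothetical helpers introduced only for this example. It joins two lists of names on textual similarity and returns the matches in decreasing order by score.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stemming.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one unit-length TF-IDF vector (as a dict) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = {t: (1 + math.log(c)) * math.log(1 + n / df[t])
               for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()} if norm else {})
    return vectors


def cosine(u, v):
    # Dot product of two unit-length sparse vectors.
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def soft_join(left, right):
    """Pair every left string with every right string, score each pair by
    similarity, drop zero-score pairs, and rank by decreasing score."""
    vecs = tfidf_vectors(left + right)
    lv, rv = vecs[:len(left)], vecs[len(left):]
    pairs = [(cosine(a, b), x, y)
             for x, a in zip(left, lv)
             for y, b in zip(right, rv)]
    pairs = [p for p in pairs if p[0] > 0]
    return sorted(pairs, key=lambda p: p[0], reverse=True)


if __name__ == "__main__":
    companies = ["Acme Software Inc", "Globex Corporation"]
    listings = ["ACME Software Incorporated", "Globex Corp", "Initech"]
    for score, x, y in soft_join(companies, listings):
        print(f"{score:.3f}  {x!r} ~ {y!r}")
```

Unlike an exact-match join, this pairs records from two sources whose names differ slightly, which is the kind of heterogeneous data integration the abstract mentions. Computing all pairs is quadratic; this sketch makes no attempt at the efficient query answering the abstract refers to.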
Similar resources
A New Document Embedding Method for News Classification
Text classification is one of the main tasks of natural language processing (NLP). In this task, documents are classified into pre-defined categories. A great deal of news spreads on the web. A text classifier can categorize news automatically, which facilitates and accelerates access to the news. The first step in text classification is to represent documents in a suitable way t...
Word Type Effects on L2 Word Retrieval and Learning: Homonym versus Synonym Vocabulary Instruction
The purpose of this study was twofold: (a) to assess the retention of two word types (synonyms and homonyms) in short-term memory, and (b) to investigate the effect of these word types on word learning by asking learners to learn their Persian meanings. A total of 73 Iranian language learners studying English translation participated in the study. For the first purpose, 36 freshmen from an ...
Named Entity Recognition in Persian Text using Deep Learning
Named entity recognition is a fundamental task in the field of natural language processing. It is also known as a subtask of information extraction. The process of recognizing named entities aims at finding proper nouns in the text and classifying them into predetermined classes such as names of people, organizations, and places. In this paper, we propose a named entity recognizer which benefi...
Comparative Effect of Visual and Auditory Teaching Techniques on Retention of Word Stress Patterns: A Case Study of English as a Foreign Language Curriculum in Iran
This study investigated the effect of two teaching techniques on retention of word stress: a visual technique (Cuisenaire Rods) and an auditory technique (nonsensical monosyllables, using the Pratt speech processing software). To this end, 60 high school participants formed the two experimental groups of the study, each with 30 students, on the basis of their proficiency scores on the KET (Key English Test). In one experime...
WHIRL in ProbLog
We show how WHIRL can be modelled as a ProbLog program, using ProbLog's Python interface to execute information retrieval algorithms with standard toolkits such as scikit-learn and the Natural Language Toolkit.
Journal:
- Artif. Intell.
Volume: 118
Publication date: 2000